Conversation
This skill explains to Copilot agents what Fabric Lakehouse is and what features and capabilities it has.
Pull request overview
This pull request adds a new skill for Microsoft Fabric Lakehouse to help AI agents provide accurate information when working with Fabric Lakehouse-related tasks. The skill provides comprehensive documentation about Lakehouse concepts, architecture, security, performance optimization, and code examples.
Changes:
- Added a new `fabric-lakehouse` skill with SKILL.md and reference documentation
- Included PySpark code examples for common Lakehouse operations (reading/writing data, Delta operations, optimization)
- Provided Data Factory integration patterns for ETL/ELT orchestration
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| skills/fabric-lakehouse/SKILL.md | Main skill definition with frontmatter, core concepts, security, shortcuts, optimization, and lineage information |
| skills/fabric-lakehouse/references/pyspark.md | PySpark code examples for Spark configuration, data reading/writing, Delta operations, schema definition, and optimization patterns |
| skills/fabric-lakehouse/references/getdata.md | Data Factory integration documentation including connectors, pipeline activities, and orchestration patterns |
> ### Tabular data in a Lakehouse
>
> Tabular data in a form of tables are stored under "Tables" folder. Main format for tables in Lakehouse is Delta. Lakehouse can store tabular data in other formats like CSV or Parquet, these formats only available for Spark querying.
> Tables can be internal, when data is stored under "Tables" folder" or external, when only reference to a table is stored under "Tables" folder but the data itself is stored in a referenced location. Referecing tables are done through Shortcuts, which can be internal, pointing to other location in Fabric, or external pointing to data stored outside of Fabric.

Spelling error: "Referecing" should be "Referencing".

```diff
- Tables can be internal, when data is stored under "Tables" folder" or external, when only reference to a table is stored under "Tables" folder but the data itself is stored in a referenced location. Referecing tables are done through Shortcuts, which can be internal, pointing to other location in Fabric, or external pointing to data stored outside of Fabric.
+ Tables can be internal, when data is stored under "Tables" folder" or external, when only reference to a table is stored under "Tables" folder but the data itself is stored in a referenced location. Referencing tables are done through Shortcuts, which can be internal, pointing to other location in Fabric, or external pointing to data stored outside of Fabric.
```
> ### Schemas for tables in a Lakehouse
>
> When creating a lakehouse user can choose to enable schemas. Schemas are used to organize Lakehouse tables. Schemas are implemented as folders under "Tables" folder and store tables inside of those folders. Default schema is "dbo" and it can't be deleted or renamed. All other schemas are optional and can be created, renamed, or deleted. User can reference schema located in other lakehouse using Schema Shortcut that way referincing all tables with one shortcut that are at the destination schema.

Spelling error: "referincing" should be "referencing".

```diff
- When creating a lakehouse user can choose to enable schemas. Schemas are used to organize Lakehouse tables. Schemas are implemented as folders under "Tables" folder and store tables inside of those folders. Default schema is "dbo" and it can't be deleted or renamed. All other schemas are optional and can be created, renamed, or deleted. User can reference schema located in other lakehouse using Schema Shortcut that way referincing all tables with one shortcut that are at the destination schema.
+ When creating a lakehouse user can choose to enable schemas. Schemas are used to organize Lakehouse tables. Schemas are implemented as folders under "Tables" folder and store tables inside of those folders. Default schema is "dbo" and it can't be deleted or renamed. All other schemas are optional and can be created, renamed, or deleted. User can reference schema located in other lakehouse using Schema Shortcut that way referencing all tables with one shortcut that are at the destination schema.
```
> ### Fabric Materialized Views
>
> Set of pre-computed tables that are automatically updated based on schedule. They provide fast query performance for complex aggregations and joins. Materialized views are defined using PySpark or Spark SQL stored in asociated Notebook.

Spelling error: "asociated" should be "associated".

```diff
- Set of pre-computed tables that are automatically updated based on schedule. They provide fast query performance for complex aggregations and joins. Materialized views are defined using PySpark or Spark SQL stored in asociated Notebook.
+ Set of pre-computed tables that are automatically updated based on schedule. They provide fast query performance for complex aggregations and joins. Materialized views are defined using PySpark or Spark SQL stored in associated Notebook.
```
> ### Data access or OneLake Security
>
> For data access use OneLake security model, which is based on Azure Active Directory (AAD) and role-based access control (RBAC). Lakehouse data is stored in OneLake, so access to data is controlled through OneLake permissions. In adition to object-level permissions, Lakehouse also supports column-level and row-level security for tables, allowing fine-grained control over who can see specific columns or rows in a table.

Spelling error: "adition" should be "addition".

```diff
- For data access use OneLake security model, which is based on Azure Active Directory (AAD) and role-based access control (RBAC). Lakehouse data is stored in OneLake, so access to data is controlled through OneLake permissions. In adition to object-level permissions, Lakehouse also supports column-level and row-level security for tables, allowing fine-grained control over who can see specific columns or rows in a table.
+ For data access use OneLake security model, which is based on Azure Active Directory (AAD) and role-based access control (RBAC). Lakehouse data is stored in OneLake, so access to data is controlled through OneLake permissions. In addition to object-level permissions, Lakehouse also supports column-level and row-level security for tables, allowing fine-grained control over who can see specific columns or rows in a table.
```
> ### V-Order Optimization
>
> For faster data read with semantic model enable V-Order optimization on Delta tables.This presorts data in a way that improves query performance for common access patterns.

Missing space after period. Should be "tables. This" instead of "tables.This".

```diff
- For faster data read with semantic model enable V-Order optimization on Delta tables.This presorts data in a way that improves query performance for common access patterns.
+ For faster data read with semantic model enable V-Order optimization on Delta tables. This presorts data in a way that improves query performance for common access patterns.
```
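For context on the V-Order section quoted above: in a notebook, V-Order writes are typically toggled through Spark session configuration. A hedged sketch follows; the property name is an assumption taken from Fabric's Delta optimization docs at the time of writing, and the `spark.conf.set` call is shown only in a comment because it requires a live Fabric Spark session:

```python
# Sketch of the session-level setting a notebook might apply to enable V-Order
# on write. The property name below is an assumption; verify it against your
# Fabric Spark runtime before relying on it.
VORDER_WRITE_CONF = {
    "spark.sql.parquet.vorder.enabled": "true",  # assumed V-Order write switch
}

# In a Fabric notebook with a live session you would apply it like:
#   for key, value in VORDER_WRITE_CONF.items():
#       spark.conf.set(key, value)
print(VORDER_WRITE_CONF["spark.sql.parquet.vorder.enabled"])
```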
> ### Files in a Lakehouse
>
> Files are stored uner "Files" folder. Users can create folders and subfolders to organize their files. Any file format can be stored in Lakehosue.

Spelling error: "uner" should be "under".

```diff
- Files are stored uner "Files" folder. Users can create folders and subfolders to organize their files. Any file format can be stored in Lakehosue.
+ Files are stored under "Files" folder. Users can create folders and subfolders to organize their files. Any file format can be stored in Lakehosue.
```
> ### Files in a Lakehouse
>
> Files are stored uner "Files" folder. Users can create folders and subfolders to organize their files. Any file format can be stored in Lakehosue.

Spelling error: "Lakehosue" should be "Lakehouse".
> ### Tabular data in a Lakehouse
>
> Tabular data in a form of tables are stored under "Tables" folder. Main format for tables in Lakehouse is Delta. Lakehouse can store tabular data in other formats like CSV or Parquet, these formats only available for Spark querying.
> Tables can be internal, when data is stored under "Tables" folder" or external, when only reference to a table is stored under "Tables" folder but the data itself is stored in a referenced location. Referecing tables are done through Shortcuts, which can be internal, pointing to other location in Fabric, or external pointing to data stored outside of Fabric.

Grammar issue: Extra quotation mark in the phrase. Should be 'under "Tables" folder' instead of 'under "Tables" folder"'. The closing quote after "folder" should be removed.

```diff
- Tables can be internal, when data is stored under "Tables" folder" or external, when only reference to a table is stored under "Tables" folder but the data itself is stored in a referenced location. Referecing tables are done through Shortcuts, which can be internal, pointing to other location in Fabric, or external pointing to data stored outside of Fabric.
+ Tables can be internal, when data is stored under "Tables" folder or external, when only reference to a table is stored under "Tables" folder but the data itself is stored in a referenced location. Referecing tables are done through Shortcuts, which can be internal, pointing to other location in Fabric, or external pointing to data stored outside of Fabric.
```
> ### Data access or OneLake Security
>
> For data access use OneLake security model, which is based on Azure Active Directory (AAD) and role-based access control (RBAC). Lakehouse data is stored in OneLake, so access to data is controlled through OneLake permissions. In adition to object-level permissions, Lakehouse also supports column-level and row-level security for tables, allowing fine-grained control over who can see specific columns or rows in a table.

Outdated terminology: "Azure Active Directory (AAD)" has been rebranded to "Microsoft Entra ID". Consider updating the terminology to reflect the current product name.

```diff
- For data access use OneLake security model, which is based on Azure Active Directory (AAD) and role-based access control (RBAC). Lakehouse data is stored in OneLake, so access to data is controlled through OneLake permissions. In adition to object-level permissions, Lakehouse also supports column-level and row-level security for tables, allowing fine-grained control over who can see specific columns or rows in a table.
+ For data access use OneLake security model, which is based on Microsoft Entra ID (formerly Azure Active Directory) and role-based access control (RBAC). Lakehouse data is stored in OneLake, so access to data is controlled through OneLake permissions. In adition to object-level permissions, Lakehouse also supports column-level and row-level security for tables, allowing fine-grained control over who can see specific columns or rows in a table.
```
> ### Key Components
>
> - **Delta Tables** Managed tables with ACID compliance and schema enforcement

Inconsistent list formatting: "Delta Tables" on line 27 is missing a separator (colon, dash, or em dash) between the term and its description. For consistency with other list items like "Files" (line 28) and "SQL Endpoint" (line 29), consider adding a separator such as a colon or dash after "Delta Tables".

```diff
- - **Delta Tables** Managed tables with ACID compliance and schema enforcement
+ - **Delta Tables**: Managed tables with ACID compliance and schema enforcement
```
Pull Request Checklist
- [x] Ran `npm start` and verified that `README.md` is up to date.

Description
This is a skill explaining to agents what Fabric Lakehouse is. I'm a product manager on the Microsoft Fabric team owning the Lakehouse experience. When using Copilot to generate a PRD or code related to Lakehouse without this skill, you get some discrepancies with the actual product. This skill will let agents be more accurate when working on Fabric Lakehouse.
Type of Contribution
Additional Notes
I'll continue maintaining and updating this skill as Fabric Lakehouse evolves.
By submitting this pull request, I confirm that my contribution abides by the Code of Conduct and will be licensed under the MIT License.